Using genetic algorithms for attribute grouping in multivariate microaggregation
Abstract
Anonymization techniques that provide k-anonymity suffer from loss of quality when the data dimensionality is high. Microaggregation techniques are no exception. Given a set of records, attributes are grouped into non-intersecting subsets and microaggregated independently. While this improves quality by reducing the loss of information, it usually leads to the loss of the k-anonymity property, increasing entity disclosure risk. In spite of this, grouping attributes is still a common practice for data sets containing a large number of records. Depending on the attributes chosen and their correlation, the amount of information loss and disclosure risk vary. However, there have been no serious attempts to propose a method for finding the best attribute grouping. In this paper, we present GOMM, the Genetic Optimizer for Multivariate Microaggregation, which, as far as we know, is the first proposal using evolutionary algorithms for this problem. The goal of GOMM is to find the optimal, or near-optimal, attribute grouping, taking into account both information loss and disclosure risk. We propose a way to map attribute subsets into a chromosome and a set of new mutation operations for this context. We also provide a comprehensive analysis of the proposed operations and show that, after using our evolutionary approach on different real data sets, we obtain ...
∗Corresponding author contact information: Phone: +34 93 401 6995, Fax: +34 93 401 7055
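The abstract mentions mapping attribute subsets into a chromosome and mutating it, but it does not give the encoding. The following is only a minimal sketch under assumed conventions, not GOMM's actual representation: each gene holds the group label of one attribute, and the hypothetical `move_attribute` mutation reassigns a single attribute to another group (or opens a new one). The names `decode` and `move_attribute` are illustrative and do not come from the paper.

```python
import random


def decode(chromosome):
    """Turn a gene list (attribute index -> group label) into attribute subsets."""
    groups = {}
    for attribute, label in enumerate(chromosome):
        groups.setdefault(label, []).append(attribute)
    # Sort the subsets so that equivalent labelings decode identically.
    return sorted(groups.values())


def move_attribute(chromosome, rng):
    """Hypothetical mutation: move one attribute to a different group."""
    mutated = list(chromosome)
    position = rng.randrange(len(mutated))
    # Any existing label plus one fresh label, so a new group can be opened.
    candidates = [
        label for label in range(max(mutated) + 2) if label != mutated[position]
    ]
    mutated[position] = rng.choice(candidates)
    return mutated


# Six attributes split into three non-intersecting subsets (assumed example).
chromosome = [0, 0, 1, 1, 2, 0]
subsets = decode(chromosome)
mutant = move_attribute(chromosome, random.Random(0))
```

Under this assumed encoding every chromosome decodes to a valid partition of the attributes, so a mutation can never produce overlapping subsets. A fitness function combining information loss and disclosure risk would then score each decoded grouping.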
Related resources
Using Genetic Algorithms for Attribute Grouping in Multivariate Microaggregation
Acknowledgements. Foremost, I would like to thank my daily supervisors Victor Muntés and Jordi Nin for their support and guidance during the development of this project. Without their help, this thesis would not have been possible. My gratitude also goes to Josep Lluís Larriba for giving me the opportunity to develop the project at the DAMA-UPC research group. I would also like to thank the people fr...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One well-researched constrained clustering algorithm is microaggregation. In a microaggreg...
Practical Data-Oriented Microaggregation for Statistical Disclosure Control
Microaggregation is a statistical disclosure control technique for microdata disseminated in statistical databases. Raw microdata (i.e., individual records or data vectors) are grouped into small aggregates prior to publication. Each aggregate should contain at least k data vectors to prevent disclosure of individual information, where k is a constant value preset by the data protector. No exa...
Multivariate and univariate analysis of genetic variation in Iranian summer savory (Satureja hortensis L.) accessions based on morphological traits
To evaluate the genetic variation in Iranian summer savory accessions, different accessions were analyzed using multivariate and univariate analysis. Results indicated that there were significant differences in some traits. The mean comparison analysis using the least significant difference (LSD) test revealed significant differences among the accessions under study. In this regard, the hig...
A Comparative Study of Microaggregation Methods
Microaggregation is a statistical disclosure control technique for microdata. Raw microdata (i.e., individual records) are grouped into small aggregates prior to publication. Each aggregate should contain at least k records to prevent disclosure of individual information. Fixed-size microaggregation consists of taking fixed-size microaggregates (size k). Data-oriented microaggregation (with vari...
Journal title:
- Intell. Data Anal.
Volume 18, Issue
Pages -
Publication date 2014